Sequencing

2650 samples were sequenced with dual-index, 100 bp paired-end reads.

Alignment

Metrics after further trimming for PHRED quality score and Illumina adapter content, and alignment against the mouse genome GRCm38 (also called mm10):

Average input read length after trimming per sample (bp)

Uniquely mapped reads during alignment per sample (%)

Multimapped reads during alignment per sample (%)

Reads unmapped during alignment because they were too short, per sample (%)

Reads unmapped during alignment due to mismatches, per sample (%)

Unmapped-other reads during alignment per sample (%)

Mismatch Rate per sample

Sample selection

## 
##   1   2   3   4   5   6   7   8   9  10 
## 263 429 573 610 250 410  66  10  38   1

Sequencing and Trimming Quality

Mapping Quality

Summary of criteria for quality control

Samples are retained only if they pass all 19 criteria, using the values specified in the table below:

Criteria                             Selection
#Genes                               >= 1000 & <= 6500
PercentMito                          >= 0 & <= 0.006
PercentERCC                          >= 0 & <= 0.011
adapter_content                      PASS
Sequences.flagged.as.poor.quality    0
sequence_duplication_levels          PASS
avg_input_read_length                >= 180 & <= 200
per_base_sequence_quality            PASS
sequence_length_distribution         PASS
basic_statistics                     PASS
per_sequence_gc_content              PASS
total_reads                          >= 20000 & <= 4e+06
per_base_n_content                   PASS
overrepresented_sequences            PASS
per_sequence_quality_scores          PASS
uniquely_mapped_percent              >= 68 & <= 100
unmapped_tooshort_percent            >= 0 & <= 17
mismatch_rate                        >= 0.15 & <= 0.5
multimapped_percent                  >= 2.3 & <= 7.7
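
The selection rule in the table can be sketched in a few lines of Python. This is an illustrative sketch, not the code used to produce this report. The thresholds are copied from the table; the names `RANGE_CRITERIA`, `EXACT_CRITERIA`, `failed_criteria`, `passes_qc` and `example` are hypothetical. It also assumes that each sample is a dictionary keyed by criterion name and that all bounds are inclusive.

```python
# Hypothetical sketch: apply the 19 QC criteria to one sample's metrics.
# Thresholds are copied from the table; the data layout is an assumption.

# Quantitative criteria: name -> (inclusive lower bound, inclusive upper bound)
RANGE_CRITERIA = {
    "#Genes": (1000, 6500),
    "PercentMito": (0, 0.006),
    "PercentERCC": (0, 0.011),
    "avg_input_read_length": (180, 200),
    "total_reads": (20000, 4e6),
    "uniquely_mapped_percent": (68, 100),
    "unmapped_tooshort_percent": (0, 17),
    "mismatch_rate": (0.15, 0.5),
    "multimapped_percent": (2.3, 7.7),
}

# Categorical criteria: name -> value the sample must have
EXACT_CRITERIA = {
    "adapter_content": "PASS",
    "Sequences.flagged.as.poor.quality": 0,
    "sequence_duplication_levels": "PASS",
    "per_base_sequence_quality": "PASS",
    "sequence_length_distribution": "PASS",
    "basic_statistics": "PASS",
    "per_sequence_gc_content": "PASS",
    "per_base_n_content": "PASS",
    "overrepresented_sequences": "PASS",
    "per_sequence_quality_scores": "PASS",
}


def failed_criteria(sample):
    """Return the names of all criteria that `sample` (a dict) fails."""
    failed = []
    for name, (low, high) in RANGE_CRITERIA.items():
        value = sample.get(name)
        # A missing metric is treated as a failure (an assumption).
        if value is None or not (low <= value <= high):
            failed.append(name)
    for name, required in EXACT_CRITERIA.items():
        if sample.get(name) != required:
            failed.append(name)
    return failed


def passes_qc(sample):
    """A sample is kept only if it fails none of the 19 criteria."""
    return not failed_criteria(sample)


# Made-up sample sitting exactly on every lower bound, for illustration only.
example = {name: low for name, (low, _) in RANGE_CRITERIA.items()}
example.update(EXACT_CRITERIA)
```

Because `failed_criteria` returns the failing names rather than a single boolean, it shows which thresholds remove the most samples, which is useful when tuning the cut-offs.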

The quantitative criteria are shown here along with the chosen thresholds. Colour indicates tissue (GM or WM).

Of the 2650 single cells sequenced, 2538 (95.8%) passed the quantitative QC shown above.